compare-commits.sh: support both llama-bench and test-backend-ops #14392

yeahdongcn · 2025-06-26T11:26:26Z

This is a follow-up to #14368, adding support for comparing test-backend-ops performance results between two commits.

Testing Done

Generated Tables

❯ cd /Users/yexiaodong/go/src/github.com/ggerganov/llama.cpp && python3 scripts/compare-llama-bench.py -b 1d5f25c53 -c ecd7fdb4c --tool test-backend-ops -i ./test-backend-ops.sqlite
| Backend   | Operation   | Parameters                              |   Bandwidth (GB/s) 1d5f25c53 |   Bandwidth (GB/s) xd/compare |   Speedup |
|:----------|:------------|:----------------------------------------|-----------------------------:|------------------------------:|----------:|
| Metal     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,1,1,1]   |                        28.42 |                         28.45 |      1.00 |
| Metal     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,512,1,1] |                        86.60 |                         97.14 |      1.12 |

❯ cd /Users/yexiaodong/go/src/github.com/ggerganov/llama.cpp && python3 scripts/compare-llama-bench.py -b ecd7fdb4c -c ecd7fdb4c --tool test-backend-ops -i ./test-backend-ops.sqlite                 
| Backend   | Operation   | Parameters                                                                       |   GFLOPS xd/test-backend-ops_sql |   GFLOPS xd/test-backend-ops_sql |   Speedup |
|:----------|:------------|:---------------------------------------------------------------------------------|---------------------------------:|---------------------------------:|----------:|
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=128,n=1,k=16416,bs=[8,1],nr=[4,1],per=[0,1,2,3],v=1      |                           127.90 |                           127.90 |      1.00 |
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=16416,n=1,k=128,bs=[8,1],nr=[4,1],per=[0,2,1,3],v=0      |                            33.98 |                            33.98 |      1.00 |
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=4096,n=1,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=0     |                            57.91 |                            57.91 |      1.00 |
| Metal     | MUL_MAT     | type_a=f16,type_b=f32,m=4096,n=2,k=14336,bs=[1,1],nr=[1,1],per=[0,1,2,3],v=0     |                           115.00 |                           115.00 |      1.00 |
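
The Speedup column in these tables appears to be the compared commit's value divided by the baseline's. That reading is inferred from the numbers shown, not from the script source. A minimal sketch under that assumption, using the two ADD bandwidth rows from the first table (the `speedup` helper is hypothetical):

```python
def speedup(baseline: float, compare: float) -> float:
    """Ratio of the compared commit's throughput to the baseline's.

    Assumption: higher-is-better metrics (GB/s, GFLOPS, t/s), so a value
    above 1.00 means the compared commit is faster.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return compare / baseline


# (baseline, compare) bandwidth pairs in GB/s, copied from the ADD rows above.
rows = [(28.42, 28.45), (86.60, 97.14)]
for base, comp in rows:
    print(f"{base:.2f} -> {comp:.2f}: speedup {speedup(base, comp):.2f}")
```

With both `-b` and `-c` pointing at the same commit, as in the second command, every ratio is 1.00 by construction.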

Generated Plot

Full Logs

test-backend-ops:

root@deccddc39743:/ws# CMAKE_OPTS="-DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21" ./scripts/compare-commits.sh 40a6430eb 3ad0161af test-backend-ops -o ADD
+ commit1=40a6430eb
+ commit2=3ad0161af
+ tool=test-backend-ops
+ additional_args='-o ADD'
+ '[' test-backend-ops '!=' llama-bench ']'
+ '[' test-backend-ops '!=' test-backend-ops ']'
+ ./scripts/compare-llama-bench.py --check
+ '[' test-backend-ops = llama-bench ']'
+ db_file=test-backend-ops.sqlite
+ target=test-backend-ops
+ run_args='perf --output sql -o ADD'
+ rm -f test-backend-ops.sqlite
+ '[' -n '' ']'
+ dir=build-bench
+ git checkout 40a6430eb
Note: switching to '40a6430eb'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 40a6430eb Update README.md
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S . -DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21
++ nproc
+ cmake --build build-bench -t test-backend-ops -j 12
+ build-bench/bin/test-backend-ops perf --output sql -o ADD
+ sqlite3 test-backend-ops.sqlite
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 MUSA devices:
  Device 0: MTT S80, compute capability 2.1, VMM: yes
+ git checkout 3ad0161af
Previous HEAD position was 40a6430eb Update README.md
HEAD is now at 3ad0161af musa: apply mublas API changes
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S . -DGGML_MUSA=ON -DMUSA_ARCHITECTURES=21
++ nproc
+ cmake --build build-bench -t test-backend-ops -j 12
+ build-bench/bin/test-backend-ops perf --output sql -o ADD
+ sqlite3 test-backend-ops.sqlite
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 MUSA devices:
  Device 0: MTT S80, compute capability 2.1, VMM: yes
+ ./scripts/compare-llama-bench.py -b 40a6430eb -c 3ad0161af --tool test-backend-ops -i test-backend-ops.sqlite
| Backend   | Operation   | Parameters                              |   Bandwidth (GB/s) xd/compare-commits |   Bandwidth (GB/s) 3ad0161af |   Speedup |
|:----------|:------------|:----------------------------------------|--------------------------------------:|-----------------------------:|----------:|
| MUSA0     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,1,1,1]   |                                  4.53 |                         4.53 |      1.00 |
| MUSA0     | ADD         | type=f32,ne=[4096,1,1,1],nr=[1,512,1,1] |                                246.99 |                       247.09 |      1.00 |
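
The `set -x` trace above, together with the llama-bench trace in the next log, suggests how the script maps the selected tool to a database file, a build target, and run arguments. The following is a Python sketch of that mapping reconstructed from the traces only. The function name `plan_run` is hypothetical, and the real logic lives in the shell script.

```python
def plan_run(tool: str, additional_args: str = "") -> dict:
    """Mirror the tool-dependent variables visible in the xtrace output."""
    if tool == "llama-bench":
        # Trace: run_args='-o sql -oe md <additional args>'
        run_args = "-o sql -oe md"
    elif tool == "test-backend-ops":
        # Trace: run_args='perf --output sql <additional args>'
        run_args = "perf --output sql"
    else:
        raise ValueError(f"unsupported tool: {tool}")
    if additional_args:
        run_args = f"{run_args} {additional_args}"
    return {
        "db_file": f"{tool}.sqlite",  # e.g. test-backend-ops.sqlite
        "target": tool,               # passed to `cmake --build ... -t`
        "run_args": run_args,
    }


print(plan_run("test-backend-ops", "-o ADD"))
```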

llama-bench:

❯ ./scripts/compare-commits.sh 7ae027c03 7736d6426 llama-bench -m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999
+ commit1=7ae027c03
+ commit2=7736d6426
+ tool=llama-bench
+ additional_args='-m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999'
+ '[' llama-bench '!=' llama-bench ']'
+ ./scripts/compare-llama-bench.py --check
+ '[' llama-bench = llama-bench ']'
+ db_file=llama-bench.sqlite
+ target=llama-bench
+ run_args='-o sql -oe md -m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999'
+ rm -f llama-bench.sqlite
+ '[' -n '' ']'
+ dir=build-bench
+ git checkout 7ae027c03
Previous HEAD position was 7736d6426 Apply suggestion from @JohannesGaessler
HEAD is now at 7ae027c03 Update README.md
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S .
CMake Warning at ggml/src/ggml-cpu/CMakeLists.txt:77 (message):
  OpenMP not found
Call Stack (most recent call first):
  ggml/src/CMakeLists.txt:361 (ggml_add_cpu_backend_variant_impl)


++ nproc
+ cmake --build build-bench -t llama-bench -j 8
+ build-bench/bin/llama-bench -o sql -oe md -m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999
+ sqlite3 llama-bench.sqlite
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| qwen2 7B Q4_K - Medium         |   4.36 GiB |     7.62 B | Metal,BLAS |       4 |           pp512 |        105.58 ± 1.85 |
| qwen2 7B Q4_K - Medium         |   4.36 GiB |     7.62 B | Metal,BLAS |       4 |           tg128 |          9.32 ± 1.50 |

build: 7ae027c03 (5844)
+ git checkout 7736d6426
Previous HEAD position was 7ae027c03 Update README.md
HEAD is now at 7736d6426 Apply suggestion from @JohannesGaessler
+ run
+ rm -fr build-bench
+ cmake -B build-bench -S .
CMake Warning at ggml/src/ggml-cpu/CMakeLists.txt:77 (message):
  OpenMP not found
Call Stack (most recent call first):
  ggml/src/CMakeLists.txt:361 (ggml_add_cpu_backend_variant_impl)


++ nproc
+ cmake --build build-bench -t llama-bench -j 8
+ sqlite3 llama-bench.sqlite
+ build-bench/bin/llama-bench -o sql -oe md -m /Users/yexiaodong/models/DeepSeek-R1-Distill-Qwen-7B-Q4_K_M.gguf -ngl 999
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| qwen2 7B Q4_K - Medium         |   4.36 GiB |     7.62 B | Metal,BLAS |       4 |           pp512 |        101.41 ± 9.71 |
| qwen2 7B Q4_K - Medium         |   4.36 GiB |     7.62 B | Metal,BLAS |       4 |           tg128 |         11.11 ± 0.99 |

build: 7736d6426 (5843)
+ ./scripts/compare-llama-bench.py -b 7ae027c03 -c 7736d6426 --tool llama-bench -i llama-bench.sqlite
| CPU                  | Model           | Test   |   t/s xd/compare-commits |   t/s 7736d6426 |   Speedup |
|:---------------------|:----------------|:-------|-------------------------:|----------------:|----------:|
| Accelerate, Apple M1 | qwen2 7B Q4_K_M | pp512  |                   105.58 |          101.41 |      0.96 |
| Accelerate, Apple M1 | qwen2 7B Q4_K_M | tg128  |                     9.32 |           11.11 |      1.19 |
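
In both logs, each commit's run is piped into the same SQLite file, and the comparison script then reads that file with `-b` and `-c`. The sketch below illustrates the idea with an in-memory database and the t/s averages from the run above. The table and column names (`results`, `commit_sha`, `test`, `tokens_per_s`) are hypothetical, since the real schema is not shown in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema for illustration only.
conn.execute("CREATE TABLE results (commit_sha TEXT, test TEXT, tokens_per_s REAL)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [
        ("7ae027c03", "pp512", 105.58),
        ("7ae027c03", "tg128", 9.32),
        ("7736d6426", "pp512", 101.41),
        ("7736d6426", "tg128", 11.11),
    ],
)


def compare_commits(conn, baseline, compare):
    """Join baseline and compare rows on the test name and compute the ratio."""
    query = """
        SELECT b.test, b.tokens_per_s, c.tokens_per_s,
               c.tokens_per_s / b.tokens_per_s
        FROM results AS b
        JOIN results AS c ON b.test = c.test
        WHERE b.commit_sha = ? AND c.commit_sha = ?
        ORDER BY b.test
    """
    return conn.execute(query, (baseline, compare)).fetchall()


for test, base, comp, ratio in compare_commits(conn, "7ae027c03", "7736d6426"):
    print(f"{test}: {base:.2f} -> {comp:.2f} (speedup {ratio:.2f})")
```

Keeping both commits in one file means a single join on the shared key columns is enough to line up the measurements.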

Copilot

Pull Request Overview

This PR adds support for comparing performance results from both llama-bench and test-backend-ops by introducing tool-specific database schemas, CLI argument parsing, and formatting functions. Key changes include:

- Refactoring database field and key property definitions to support both tools.
- Updating table queries and input file handling based on a new --tool argument.
- Enhancing the CLI script (compare-commits.sh) to allow selection of the tool and additional arguments.
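
A rough sketch of what tool selection through a `--tool` argument could look like with `argparse`. The short flags `-b`, `-c`, `-i` and `--tool` appear in the commands in this PR. The long option names, the help strings, and the `llama-bench` default are assumptions for illustration and may not match the actual script.

```python
import argparse

TOOLS = ["llama-bench", "test-backend-ops"]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Compare benchmark results of two commits")
    # Long option names below are assumed; only the short flags appear in the PR.
    parser.add_argument("-b", "--baseline", help="baseline commit")
    parser.add_argument("-c", "--compare", help="commit to compare against the baseline")
    parser.add_argument("-i", "--input", help="SQLite file with the benchmark results")
    # Assumed default: keep the pre-existing llama-bench behavior.
    parser.add_argument("--tool", choices=TOOLS, default="llama-bench",
                        help="tool that produced the results")
    return parser


args = build_parser().parse_args(
    ["-b", "40a6430eb", "-c", "3ad0161af",
     "--tool", "test-backend-ops", "-i", "test-backend-ops.sqlite"]
)
print(args.tool, args.input)
```

Restricting `--tool` with `choices` makes an unsupported tool fail at parse time instead of later, when the tool-specific schema is looked up.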

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

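| File                           | Description                                                                           |
|:-------------------------------|:--------------------------------------------------------------------------------------|
| scripts/compare-llama-bench.py | Adjusts SQLite table creation, queries, and result formatting for dual tool support. |
| scripts/compare-commits.sh     | Updates argument parsing and build/run logic to handle multiple tools.               |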

yeahdongcn · 2025-07-05T04:14:21Z

Hi @JohannesGaessler @slaren @ggerganov, I’ve merged #14368 into master. Could you please continue reviewing this one when you have a moment? Thanks!

yeahdongcn requested a review from Copilot June 26, 2025 11:28

Copilot AI reviewed Jun 26, 2025

github-actions bot added the script and python labels Jun 26, 2025

yeahdongcn requested review from ggerganov, slaren and JohannesGaessler June 26, 2025 11:32

yeahdongcn marked this pull request as ready for review June 26, 2025 11:33

yeahdongcn mentioned this pull request Jul 1, 2025

test-backend-ops: add support for specifying output format #14368

Merged

yeahdongcn force-pushed the xd/compare-commits branch from 5c1951b to b5ea15f Compare July 5, 2025 04:15

JohannesGaessler reviewed Jul 7, 2025

yeahdongcn force-pushed the xd/compare-commits branch 2 times, most recently from 88b3c64 to 5e8f738 Compare July 8, 2025 02:07

yeahdongcn and others added 4 commits July 8, 2025 10:09

compare-commits.sh: support both llama-bench and test-backend-ops

e561fbc

Signed-off-by: Xiaodong Ye <[email protected]>

Speed up the build by specifying -j 12

ebf518d

Signed-off-by: Xiaodong Ye <[email protected]>

Remove build_number from test-backend-ops db

5c5fc53

Signed-off-by: Xiaodong Ye <[email protected]>

Apply suggestion from @JohannesGaessler

7736d64

Co-authored-by: Johannes Gäßler <[email protected]>

yeahdongcn force-pushed the xd/compare-commits branch from 5e8f738 to 7736d64 Compare July 8, 2025 02:09

Refine tool selection logic

a7940f7

Signed-off-by: Xiaodong Ye <[email protected]>

yeahdongcn requested a review from JohannesGaessler July 10, 2025 00:47
